A Distributed Fault-Detection and Recovery Protocol for Reliable Multicast Collaborative Communications
Authors
Abstract
Reliable multicast transport protocols support dissemination communication, and their throughput is limited either by the sender or by intermediate nodes that consolidate acknowledgements and retransmissions. RMSP is a portable reliable multicast protocol that supports collaborative communication at the session layer and provides higher throughput by maintaining two transport connections per node irrespective of the number of nodes. However, the throughput of RMSP may degrade if one or more nodes, or the centralized connection manager that monitors them, fails. In this report, we propose a fully distributed fault-tolerant protocol, DFT-RMSP, that detects and isolates faulty nodes and reinstates the multicast collaborative communications. We derive analytic expressions for throughput degradation and analyze its sensitivity to node-failure rates, network delay, and multicast group size. We show that, compared with RMSP, DFT-RMSP provides less down-time and achieves graceful throughput degradation under high failure rates and simultaneous node failures.
Similar resources
Fault Recovery in the Multicast Protocol
The Reliable Multicast Protocol (RMP) provides a unique, group-based model for distributed programs that need to handle reconfiguration events at the application layer. This model, called membership views, provides an abstraction in which events such as site failures, network partitions, and normal join-leave events are viewed as group reformations. RMP provides access to this model through an ...
Specification and Design of a Fault Recovery Model for the Reliable Multicast Protocol
The Reliable Multicast Protocol (RMP) provides a unique, group-based model for distributed programs that need to handle reconfiguration events at the application layer. This model, called membership views, provides an abstraction in which events such as site failures, network partitions, and normal join-leave events are viewed as group reformations. RMP provides access to this model through an...
Specification and Design of a Fault Recovery Model for the Reliable Multicast Protocol
The Reliable Multicast Protocol (RMP) provides a unique, group-based model for distributed programs that need to handle reconfiguration events at the application layer. This model, called membership views, provides an abstraction in which events such as site failures, network partitions, and normal join-leave events are viewed as group reformations. RMP provides access to this model through an a...
A reliable ordered delivery protocol for interconnected local area networks
We present the Totem multiple-ring protocol, a novel reliable ordered multicast protocol for multiple interconnected local-area networks. The protocol exhibits excellent performance and maintains a consistent network-wide total order of messages despite network partitioning and remerging, or processor failure and recovery with stable storage intact. The Totem protocol is designed for fault-tole...
Low Latency Fault Tolerance System
The Low Latency Fault Tolerance (LLFT) system provides fault tolerance for distributed applications within a local-area network, using a leader-follower replication strategy. LLFT provides application-transparent replication, with strong replica consistency, for applications that involve multiple interacting processes or threads. Its novel system model enables LLFT to maintain a single consiste...